Methods and systems for picture resampling

ABSTRACT

Embodiments of the present invention comprise systems and methods for picture up-sampling and picture down-sampling. Some embodiments of the present invention provide an up-sampling and/or down-sampling procedure designed for the Scalable Video Coding extension of H.264/MPEG-4 AVC.

RELATED REFERENCES

This application claims the benefit of U.S. Provisional Patent Application No. 60/738,136, entitled “Methods and Systems for Up-sampling and Down-sampling of Interlaced Materials,” filed Nov. 18, 2005, invented by Shijun Sun.

FIELD OF THE INVENTION

Embodiments of the present invention comprise methods and systems for picture resampling. Some embodiments of the present invention comprise methods and systems for picture resampling for spatially scalable video coding.

BACKGROUND

H.264/MPEG-4 AVC [Joint Video Team of ITU-T VCEG and ISO/IEC MPEG, “Advanced Video Coding (AVC)-4th Edition,” ITU-T Rec. H.264 and ISO/IEC 14496-10 (MPEG4-Part 10), January 2005], which is incorporated by reference herein, is a video codec specification that uses macroblock prediction followed by residual coding to reduce temporal and spatial redundancy in a video sequence for compression efficiency. Spatial scalability refers to a functionality in which parts of a bitstream may be removed while maintaining rate-distortion performance at any supported spatial resolution. Single-layer H.264/MPEG-4 AVC does not support spatial scalability. Spatial scalability is supported by the Scalable Video Coding (SVC) extension of H.264/MPEG-4 AVC.

The SVC extension of H.264/MPEG-4 AVC [Working Document 1.0 (WD-1.0) (MPEG Doc. N6901) for the Joint Scalable Video Model (JSVM)], which is incorporated by reference herein, is a layered video codec in which the redundancy between spatial layers is exploited by inter-layer prediction mechanisms. Three inter-layer prediction techniques are included in the design of the SVC extension of H.264/MPEG-4 AVC: inter-layer motion prediction, inter-layer residual prediction, and inter-layer intra texture prediction.

Previously, only dyadic spatial scalability was addressed by SVC. Dyadic spatial scalability refers to configurations in which the ratio of picture dimensions between two successive spatial layers is a power of 2. New tools have been proposed that manage configurations in which the ratio of picture dimensions between successive spatial layers is not a power of 2 and in which the pictures of the higher level can contain regions that are not present in corresponding pictures of the lower level; such configurations are referred to as non-dyadic scaling with cropping window.

All of the inter-layer prediction methods comprise picture up-sampling. Picture up-sampling is the process of generating a higher resolution image from a lower resolution image. Some picture up-sampling processes comprise sample interpolation. The prior up-sampling process used in the SVC design was based on the quarter luma sample interpolation procedure specified in H.264 for inter prediction. When applied to spatially scalable coding, the prior method has the following two drawbacks: the interpolation resolution is limited to quarter samples and thus does not support non-dyadic scaling; and half-sample interpolation is required in order to get a quarter-sample position, making this method computationally cumbersome. A picture up-sampling process that overcomes these limitations is desired.

SUMMARY

Some embodiments of the present invention are related to the Scalable Video Coding (SVC) extension of H.264/AVC. The SVC extension of H.264 currently (in Joint Draft version 4) only addresses spatial scalability between progressive video sequences (or frames).

Some embodiments of the present invention relate to the resampling (down-/up-sampling) processes involving interlaced materials.

Some embodiments of the present invention comprise picture up-sampling accomplished through direct interpolation using filter coefficients selected based on the phase of the location of the pixel to be interpolated.

The foregoing and other objectives, features, and advantages of the invention will be more readily understood upon consideration of the following detailed description of the invention taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE SEVERAL DRAWINGS

FIG. 1 is a diagram illustrating the geometrical relationship between an enhancement layer and a base layer;

FIG. 2 is a diagram showing the relative location of a sample in an enhancement layer and a base layer;

FIG. 3 is a flow diagram of an embodiment of the present invention comprising interpolation filtering in two directions;

FIG. 4 is a diagram illustrating the relationship between macroblocks in an enhancement layer and a base layer;

FIG. 5 is a diagram illustrating the relationship between macroblocks in an enhancement layer and a base layer;

FIG. 6 is a diagram illustrating an embodiment of the present invention comprising field interpolation;

FIG. 7 is a diagram illustrating an embodiment of the present invention comprising field interpolation and filter data selection based on sample position;

FIG. 8 is a diagram illustrating an embodiment of the present invention comprising field interpolation and filter data selection based on sample position phase; and

FIG. 9 is a diagram illustrating an embodiment of the present invention comprising resampling mode signaling.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

Embodiments of the present invention will be best understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The figures listed above are expressly incorporated as part of this detailed description.

It will be readily understood that the components of the present invention, as generally described and illustrated in the figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the methods and systems of the present invention is not intended to limit the scope of the invention but is merely representative of the presently preferred embodiments of the invention.

Elements of embodiments of the present invention may be embodied in hardware, firmware and/or software. While exemplary embodiments revealed herein may only describe one of these forms, it is to be understood that one skilled in the art would be able to effectuate these elements in any of these forms while resting within the scope of the present invention.

H.264/MPEG-4 AVC [Joint Video Team of ITU-T VCEG and ISO/IEC MPEG, “Advanced Video Coding (AVC)-4th Edition,” ITU-T Rec. H.264 and ISO/IEC 14496-10 (MPEG4-Part 10), January 2005] is incorporated herein by reference.

The SVC extension of H.264/MPEG-4 AVC [Working Document 1.0 (WD-1.0) (MPEG Doc. N6901) for the Joint Scalable Video Model (JSVM)] is incorporated herein by reference.

For the purposes of this specification and claims, the term “picture” may comprise an array of pixels, a digital image, a subdivision of a digital image, a data channel of a digital image or another representation of image data.

FIG. 1 shows two pictures corresponding to an image picture: a lower spatial picture 10, also referred to as a base spatial picture or base-layer picture, and a higher spatial picture 100, also referred to as an enhancement spatial picture or enhancement-layer picture. The base spatial picture 10 may have lower spatial resolution than the enhancement spatial picture 100, as shown in FIG. 1. The base spatial picture 10 may not include the same spatial region as that of the enhancement spatial picture 100, as shown in FIG. 1. Shown in FIG. 1 is a base spatial picture 10 corresponding to a spatial region 110 cropped from the enhancement spatial picture 100. The base-layer picture 10 may be an interlaced-frame picture comprising an upper and a lower field, or a progressive-frame picture. The enhancement-layer picture 100 may be an interlaced-frame picture comprising an upper and a lower field, or a progressive-frame picture.

In some embodiments of the present invention, the base spatial picture and the enhancement spatial picture may correspond to two spatial layers in a scalable video coder/decoder (codec).

The width 101 of the enhancement spatial picture 100 and the height 102 of the enhancement spatial picture 100 may be denoted w_(enh) and h_(enh), respectively. The width 11 and the height 12 of the base spatial picture 10 may be denoted w_(base) and h_(base), respectively. The base spatial picture 10 may be a sub-sampled version of a sub-region 110 of the enhancement spatial picture 100 positioned at enhancement spatial picture coordinates (x_(orig), y_(orig)) 103. The position 103 represents the position of the upper-left corner of the cropping window 110. The width 111 and the height 112 of the sub-region 110 may be denoted w_(extract) and h_(extract), respectively. The parameters (x_(orig), y_(orig), w_(extract), h_(extract), w_(base), h_(base)) define the relationship between the higher spatial picture 100 and the lower spatial picture 10.

Picture up-sampling may refer to the generation of a higher spatial resolution image from a lower spatial resolution image. In some embodiments, up-sampling may refer to increasing resolution in any dimension not limited to spatial dimensions or a temporal dimension. FIG. 2 shows a pixel location 220 in a higher spatial resolution image 200. In FIG. 2, pixel location 220 has a corresponding location 22 in the lower spatial resolution image 20. The location 220 may align directly with a pixel location in the lower spatial resolution image 20, or it may not align directly with a pixel location in the lower spatial resolution image 20. In FIG. 2, the location 22 is shown located between four base-layer pixels, 21, 23, 24, and 25.

Some embodiments of the present invention comprise methods and systems for direct interpolation of the pixels of the enhancement spatial picture 200 given the base spatial picture 20 wherein the ratios of dimensions are not limited to a power of 2. Some embodiments of the present invention comprise up-sampling on the entire picture of the base spatial picture 20. Other embodiments of the present invention comprise block-by-block up-sampling of the base spatial picture 20. Some embodiments of the present invention comprise up-sampling in one direction followed by up-sampling in another direction. For up-sampling, determining the corresponding location in the base-layer picture of a sample position in the enhancement layer may be required. This determination may depend on the format, progressive or interlaced, of the base-layer picture and the enhancement-layer picture.

Some embodiments of the present invention comprise methods and systems for direct interpolation of the pixels of the base spatial picture 20 given the enhancement spatial picture 200 wherein the ratios of dimensions are not limited to a power of 2. Some embodiments of the present invention comprise down-sampling on the entire picture of the enhancement spatial picture 200. Other embodiments of the present invention comprise block-by-block down-sampling of the enhancement spatial picture 200. Some embodiments of the present invention comprise down-sampling in one direction followed by down-sampling in another direction. For down-sampling, determining the corresponding location in the enhancement-layer picture of a sample position in the base layer may be required. This determination may depend on the format, progressive or interlaced, of the base-layer picture and the enhancement-layer picture.

Progressive-Material Embodiments

If both the base-layer picture and the enhancement-layer picture are progressive frames, then for a sample position (x, y) in the enhancement spatial picture in units of integer samples, the corresponding position (p_(x,L)(x), p_(y,L)(y)) in the base spatial picture, in units of $\frac{1}{R_{L}}$ samples, may be given by:

$$\begin{cases} p_{x,L}(x) = \left[ (x - x_{orig}) \cdot w_{base} \cdot R_{L} + \frac{R_{L}}{2} (w_{base} - w_{extract}) \right] / w_{extract} \\ p_{y,L}(y) = \left[ (y - y_{orig}) \cdot h_{base} \cdot R_{L} + \frac{R_{L}}{2} (h_{base} - h_{extract}) \right] / h_{extract} \end{cases} \tag{1}$$

where the parameters (x_(orig), y_(orig), w_(extract), h_(extract), w_(base), h_(base)) define the relationship between the higher spatial picture 100 and the lower spatial picture 10, as in FIG. 1, and R_(L) is the interpolation resolution. Some embodiments comprise one-sixteenth-sample resolution interpolation, and in such embodiments R_(L) is 16.

In some embodiments, the corresponding position (p_(x,L)(x), p_(y,L)(y)) in the base spatial picture 10 may be given by:

$$\begin{cases} p_{x,L}(x) = \left[ (x - x_{orig}) \cdot w_{base} \cdot R_{L} + \frac{R_{L}}{2} (w_{base} - w_{extract}) \right] // w_{extract} \\ p_{y,L}(y) = \left[ (y - y_{orig}) \cdot h_{base} \cdot R_{L} + \frac{R_{L}}{2} (h_{base} - h_{extract}) \right] // h_{extract} \end{cases} \tag{1a}$$

where, as above, the parameters (x_(orig), y_(orig), w_(extract), h_(extract), w_(base), h_(base)) define the relationship between the higher spatial picture 100 and the lower spatial picture 10, as in FIG. 1, R_(L) is the interpolation resolution, and “//” comprises a computationally simplified division operation, such as integer division with rounding. The sample positions may not be limited to powers of 2, and direct calculation of a sample position allows for direct interpolation of picture values at that sample position.

For an interpolation resolution of R_(L)=16, the corresponding position in the base-layer picture becomes:

$$\begin{cases} p_{x,L}(x) = \left[ (x - x_{orig}) \cdot w_{base} \cdot 16 + 8 \cdot (w_{base} - w_{extract}) \right] / w_{extract} \\ p_{y,L}(y) = \left[ (y - y_{orig}) \cdot h_{base} \cdot 16 + 8 \cdot (h_{base} - h_{extract}) \right] / h_{extract} \end{cases} \tag{1b}$$
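By way of a non-limiting illustration, the luma position calculation of Eq. (1a) with R_(L)=16 may be sketched in Python as follows. The function names `div_round` and `luma_base_position` are hypothetical, and the round-to-nearest behavior of `div_round` is only one assumed reading of the “//” operation.

```python
def div_round(num: int, den: int) -> int:
    """Integer division rounded to nearest; one assumed reading of '//' (den > 0)."""
    return (2 * num + den) // (2 * den)


def luma_base_position(x, y, x_orig, y_orig, w_extract, h_extract,
                       w_base, h_base, r_l=16):
    """Return (p_x, p_y) in the base picture, in units of 1/r_l luma samples."""
    half = r_l // 2
    p_x = div_round((x - x_orig) * w_base * r_l + half * (w_base - w_extract),
                    w_extract)
    p_y = div_round((y - y_orig) * h_base * r_l + half * (h_base - h_extract),
                    h_extract)
    return p_x, p_y


# Dyadic 2:1 case: enhancement sample 0 lies a quarter sample before base sample 0.
print(luma_base_position(0, 0, 0, 0, 16, 16, 8, 8))   # (-4, -4)
# Non-dyadic 3:2 case.
print(luma_base_position(1, 2, 0, 0, 3, 3, 2, 2))     # (8, 19)
```

Because the ratio w_(base)/w_(extract) enters only through integer multiplication and a single division, the same code path handles dyadic and non-dyadic ratios.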

For a chroma sample position (x_(c), y_(c)) in the enhancement picture in units of single chroma samples, its corresponding position in the base picture (p_(x,c)(x_(c)), p_(y,c)(y_(c))) in units of one-sixteenth chroma samples of the base picture may be derived as:

$$\begin{cases} p_{x,c}(x_{c}) = \left[ (x_{c} - x_{orig,c}) \cdot w_{base,c} \cdot R_{C} + \frac{R_{C}}{4} (2 + p_{enh,x}) w_{base,c} - \frac{R_{C}}{4} (2 + p_{base,x}) w_{extract,c} \right] // w_{extract,c} \\ p_{y,c}(y_{c}) = \left[ (y_{c} - y_{orig,c}) \cdot h_{base,c} \cdot R_{C} + \frac{R_{C}}{4} (2 + p_{enh,y}) h_{base,c} - \frac{R_{C}}{4} (2 + p_{base,y}) h_{extract,c} \right] // h_{extract,c} \end{cases} \tag{2}$$

in which R_(C)=16, (x_(orig,c), y_(orig,c)) represents the position of the upper-left corner of the cropping window in the current picture in units of single chroma samples of the current picture, (w_(base,c), h_(base,c)) is the resolution of the base picture in units of single chroma samples of the base picture, (w_(extract,c), h_(extract,c)) is the resolution of the cropping window in units of single chroma samples of the current picture, (p_(base,x), p_(base,y)) represents the relative chroma phase shift of the base picture in units of quarter chroma samples of the base picture, and (p_(enh,x), p_(enh,y)) represents the relative chroma phase shift of the current picture in units of quarter chroma samples of the current picture.

Similarly, when the interpolation resolution is R_(C)=16, the corresponding position in the base picture (p_(x,c), p_(y,c)) in units of 1/16 chroma samples of the base picture becomes:

$$\begin{cases} p_{x,c}(x_{c}) = \left[ (x_{c} - x_{orig,c}) \cdot w_{base,c} \cdot 16 + 4 \cdot (2 + p_{enh,x}) w_{base,c} - 4 \cdot (2 + p_{base,x}) w_{extract,c} \right] // w_{extract,c} \\ p_{y,c}(y_{c}) = \left[ (y_{c} - y_{orig,c}) \cdot h_{base,c} \cdot 16 + 4 \cdot (2 + p_{enh,y}) h_{base,c} - 4 \cdot (2 + p_{base,y}) h_{extract,c} \right] // h_{extract,c} \end{cases} \tag{2a}$$

where (x_(orig,c), y_(orig,c)) represents the position of the upper-left corner of the cropping window in the enhancement picture in units of single chroma samples of the current picture, (w_(base,c), h_(base,c)) is the resolution of the base picture in units of single chroma samples of the base picture, (w_(extract,c), h_(extract,c)) is the resolution of the cropping window in units of single chroma samples of the enhancement picture, (p_(base,x), p_(base,y)) represents the relative chroma phase position of the base picture in units of quarter chroma samples of the base picture, and (p_(enh,x), p_(enh,y)) represents the relative chroma phase position of the enhancement picture in units of quarter chroma samples of the enhancement picture.
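As a hedged illustration of Eq. (2a), the chroma position calculation may be sketched as below. The names `div_round` and `chroma_base_position` are hypothetical, the phase shifts are passed as (x, y) tuples in quarter chroma samples, and round-to-nearest division is again an assumed interpretation of “//”.

```python
def div_round(num: int, den: int) -> int:
    """Integer division rounded to nearest; one assumed reading of '//' (den > 0)."""
    return (2 * num + den) // (2 * den)


def chroma_base_position(xc, yc, x_orig_c, y_orig_c, w_extract_c, h_extract_c,
                         w_base_c, h_base_c, p_enh=(0, 0), p_base=(0, 0)):
    """Return (p_x, p_y) in the base picture, in units of 1/16 chroma samples."""
    num_x = ((xc - x_orig_c) * w_base_c * 16
             + 4 * (2 + p_enh[0]) * w_base_c
             - 4 * (2 + p_base[0]) * w_extract_c)
    num_y = ((yc - y_orig_c) * h_base_c * 16
             + 4 * (2 + p_enh[1]) * h_base_c
             - 4 * (2 + p_base[1]) * h_extract_c)
    return div_round(num_x, w_extract_c), div_round(num_y, h_extract_c)


# With zero phase shifts the result matches the luma formula for the same geometry.
print(chroma_base_position(1, 2, 0, 0, 3, 3, 2, 2))                      # (8, 19)
# A horizontal phase shift of -1 quarter sample in both layers moves p_x.
print(chroma_base_position(1, 2, 0, 0, 3, 3, 2, 2, (-1, 0), (-1, 0)))    # (9, 19)
```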

Based on the phase positions derived by Eqs. 1-2, up-sampling filters can be chosen, in some embodiments, from a pre-defined filter table for the interpolation process. Details are described in a section below. Other filter selection and/or calculation processes, which are related to the phase position, may also be used.

Given a luma sample in the low-resolution picture at position (x, y) in units of single luma samples of the low-resolution picture, its corresponding position in the high-resolution picture (p_(x,L), p_(y,L)) in units of one-sixteenth luma samples of the high-resolution picture may be derived as:

$$\begin{cases} p_{x,L} = 16 \cdot x_{orig} + \left[ x \cdot w_{extract} \cdot 16 + 8 \cdot (w_{extract} - w_{base}) \right] // w_{base} \\ p_{y,L} = 16 \cdot y_{orig} + \left[ y \cdot h_{extract} \cdot 16 + 8 \cdot (h_{extract} - h_{base}) \right] // h_{base} \end{cases} \tag{3}$$

Given a chroma sample in the low-resolution picture at position (x_(c), y_(c)) in units of single chroma samples of the low-resolution picture, its corresponding position in the high-resolution picture (p_(x,c), p_(y,c)) in units of one-sixteenth chroma samples of the high-resolution picture may be derived as:

$$\begin{cases} p_{x,c} = 16 \cdot x_{orig,c} + \left[ x_{c} \cdot w_{extract,c} \cdot 16 + 4 \cdot (2 + p_{base,x}) w_{extract,c} - 4 \cdot (2 + p_{enh,x}) w_{base,c} \right] // w_{base,c} \\ p_{y,c} = 16 \cdot y_{orig,c} + \left[ y_{c} \cdot h_{extract,c} \cdot 16 + 4 \cdot (2 + p_{base,y}) h_{extract,c} - 4 \cdot (2 + p_{enh,y}) h_{base,c} \right] // h_{base,c} \end{cases} \tag{4}$$

Based on the sampling positions derived from Eqs. 3-4, down-sampling filters can be selected from a pre-defined set of filters, in some embodiments. However, the down-sampling process is not a normative part of the SVC.
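For illustration only, the luma mapping of Eq. (3) used for down-sampling may be sketched as follows. The names `div_round` and `luma_enh_position` are hypothetical, and the rounding rule for “//” is an assumption.

```python
def div_round(num: int, den: int) -> int:
    """Integer division rounded to nearest; one assumed reading of '//' (den > 0)."""
    return (2 * num + den) // (2 * den)


def luma_enh_position(x, y, x_orig, y_orig, w_extract, h_extract, w_base, h_base):
    """Map a base-layer luma sample to the enhancement picture (1/16-sample units)."""
    p_x = 16 * x_orig + div_round(x * w_extract * 16 + 8 * (w_extract - w_base),
                                  w_base)
    p_y = 16 * y_orig + div_round(y * h_extract * 16 + 8 * (h_extract - h_base),
                                  h_base)
    return p_x, p_y


# Dyadic 2:1 case: base sample 0 lies half a sample past enhancement sample 0.
print(luma_enh_position(0, 0, 0, 0, 16, 16, 8, 8))   # (8, 8)
# 3:2 case with the cropping window starting at x_orig = 4.
print(luma_enh_position(1, 1, 4, 0, 3, 3, 2, 2))     # (92, 28)
```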

Interlaced Material Embodiments

Some embodiments of the present invention are related to the Scalable Video Coding (SVC) extension of H.264/AVC. The SVC extension of H.264 currently (in Joint Draft version 4) only addresses spatial scalability between progressive video sequences (or frames). Some embodiments of the present invention address issues related to the resampling (down-/up-sampling) processes involving interlaced materials.

Some embodiments of the present invention comprise solutions for field-to-frame, frame-to-field, and field-to-field resampling processes. Some of these embodiments relate to a direct interpolation method with an interpolation resolution of 1/16 sample; however, other interpolation resolutions and methods may be used.

Resampling of Interlaced Materials

In some embodiments of the present invention, which handle interlaced materials, it may be assumed that all parameters (x_(orig), y_(orig), w_(extract), h_(extract), w_(base), h_(base)), (p_(base,x), p_(base,y)), and (p_(enh,x), p_(enh,y)) are defined as above for frame-based pictures; additionally, (y_(orig), h_(extract)) shall be multiples of 4. The following two new parameters φ_(enh,y) and φ_(base,y) may be defined and used for the generalized resampling processes:

$$\phi_{enh,y} = \begin{cases} p_{enh,y} - 1 & \text{for enhancement top\_field (4:2:0)} \\ p_{enh,y} + 1 & \text{for enhancement bot\_field (4:2:0)} \\ 2 \cdot p_{enh,y} & \text{otherwise (enhancement frame, etc.)} \end{cases} \tag{5}$$

$$\phi_{base,y} = \begin{cases} p_{base,y} - 1 & \text{for base top\_field (4:2:0)} \\ p_{base,y} + 1 & \text{for base bot\_field (4:2:0)} \\ 2 \cdot p_{base,y} & \text{otherwise (base frame, etc.)} \end{cases} \tag{6}$$

The parameters φ_(enh,y) and φ_(base,y) represent the chroma vertical phase position in units of ⅛ chroma samples of the enhancement and base pictures (either field or frame), respectively.
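The case analysis of Eqs. 5-6 may be sketched, under stated assumptions, as a single helper applied to either layer. The function name `vertical_chroma_phase` and the string labels for the picture structure are hypothetical choices made for this example.

```python
def vertical_chroma_phase(p_y: int, structure: str, is_420: bool = True) -> int:
    """Return the vertical chroma phase in 1/8 chroma samples.

    p_y is the frame-based phase in quarter chroma samples; structure is one of
    'top_field', 'bottom_field' or 'frame' (labels assumed for this sketch).
    """
    if is_420 and structure == "top_field":
        return p_y - 1
    if is_420 and structure == "bottom_field":
        return p_y + 1
    # Frames, and any non-4:2:0 material, simply convert quarter units to eighths.
    return 2 * p_y


print(vertical_chroma_phase(0, "top_field"))      # -1
print(vertical_chroma_phase(0, "bottom_field"))   # 1
print(vertical_chroma_phase(-1, "frame"))         # -2
```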

Field-to-Field Resampling

In some embodiments, it may be required that the two corresponding field pictures be of the same parity (i.e., either top or bottom), although, in some other embodiments, these exemplary equations may be slightly modified to support cross-parity resampling. An exemplary application for field-to-field resampling is SD-to-1080i scalable coding.

Given a luma sample in the cropping window of the enhancement field at position (x, y) in units of single luma samples, its corresponding position in the base field (p_(x,L)(x), p_(y,L)(y)) in units of 1/16 luma samples of the base field can be derived as:

$$\begin{cases} p_{x,L}(x) = \left[ (x - x_{orig}) \cdot w_{base} \cdot 16 + 8 \cdot (w_{base} - w_{extract}) \right] // w_{extract} \\ p_{y,L}(y) = \left[ (y - y'_{orig}) \cdot h_{base} \cdot 16 + 8 \cdot (h_{base} - h_{extract}) \right] // h_{extract} \end{cases} \quad \text{with } y'_{orig} = \frac{y_{orig}}{2}. \tag{7}$$

Similarly, given a chroma sample in the enhancement picture at position (x_(c), y_(c)) in units of single chroma samples, its corresponding position in the base picture (p_(x,c)(x_(c)), p_(y,c)(y_(c))) in units of 1/16 chroma samples of the base picture can be derived as:

$$\begin{cases} p_{x,c}(x_{c}) = \left[ (x_{c} - x_{orig,c}) \cdot w_{base,c} \cdot 16 + 4 \cdot (2 + p_{enh,x}) w_{base,c} - 4 \cdot (2 + p_{base,x}) w_{extract,c} \right] // w_{extract,c} \\ p_{y,c}(y_{c}) = \left[ (y_{c} - y'_{orig,c}) \cdot h_{base,c} \cdot 16 + 2 \cdot (4 + \phi_{enh,y}) h_{base,c} - 2 \cdot (4 + \phi_{base,y}) h_{extract,c} \right] // h_{extract,c} \end{cases} \quad \text{with } y'_{orig,c} = \frac{y_{orig,c}}{2}. \tag{8}$$

Based on the phase positions derived by Eqs. 7-8, up-sampling filters can be chosen from a pre-defined filter table, as described in exemplary embodiments above, for the interpolation process.
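As a minimal sketch of the vertical components of Eqs. 7-8, the following Python code halves the frame-based cropping offset and applies the ⅛-sample chroma phases. The function names are hypothetical, the rounding rule for “//” is assumed, and the vertical offsets are assumed to be even so that halving is exact.

```python
def div_round(num: int, den: int) -> int:
    """Integer division rounded to nearest; one assumed reading of '//' (den > 0)."""
    return (2 * num + den) // (2 * den)


def field_luma_y(y, y_orig, h_extract, h_base):
    """Vertical luma position in the base field, in 1/16 luma samples (Eq. 7)."""
    y_orig_field = y_orig // 2          # y'_orig; exact when y_orig is even
    num = (y - y_orig_field) * h_base * 16 + 8 * (h_base - h_extract)
    return div_round(num, h_extract)


def field_chroma_y(yc, y_orig_c, h_extract_c, h_base_c, phi_enh, phi_base):
    """Vertical chroma position in the base field, in 1/16 chroma samples (Eq. 8)."""
    y_orig_field = y_orig_c // 2        # y'_orig,c
    num = ((yc - y_orig_field) * h_base_c * 16
           + 2 * (4 + phi_enh) * h_base_c
           - 2 * (4 + phi_base) * h_extract_c)
    return div_round(num, h_extract_c)


print(field_luma_y(5, 8, 12, 8))                 # 8
print(field_chroma_y(1, 0, 12, 8, -1, -1))       # 9
```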

Similar to the up-sampling process, given a luma sample in the low-resolution field at position (x, y) in units of single luma samples of the low-resolution picture, its corresponding position in the high-resolution field (p_(x,L), p_(y,L)) in units of one-sixteenth luma samples of the high-resolution field can be derived as:

$$\begin{cases} p_{x,L} = 16 \cdot x_{orig} + \left[ x \cdot w_{extract} \cdot 16 + 8 \cdot (w_{extract} - w_{base}) \right] // w_{base} \\ p_{y,L} = 16 \cdot y'_{orig} + \left[ y \cdot h_{extract} \cdot 16 + 8 \cdot (h_{extract} - h_{base}) \right] // h_{base} \end{cases} \tag{9}$$

Given a chroma sample in the low-resolution field at position (x_(c), y_(c)) in units of single chroma samples of the low-resolution picture, its corresponding position in the high-resolution field (p_(x,c), p_(y,c)) in units of one-sixteenth chroma samples of the high-resolution field can be derived as:

$$\begin{cases} p_{x,c} = 16 \cdot x_{orig,c} + \left[ x_{c} \cdot w_{extract,c} \cdot 16 + 4 \cdot (2 + p_{base,x}) w_{extract,c} - 4 \cdot (2 + p_{enh,x}) w_{base,c} \right] // w_{base,c} \\ p_{y,c} = 16 \cdot y'_{orig,c} + \left[ y_{c} \cdot h_{extract,c} \cdot 16 + 2 \cdot (4 + \phi_{base,y}) h_{extract,c} - 2 \cdot (4 + \phi_{enh,y}) h_{base,c} \right] // h_{base,c} \end{cases} \tag{10}$$

Based on sampling positions derived from Eqs. 9-10, down-sampling filters can be selected or calculated. In some embodiments, filters may be selected from a pre-defined set of filters.

The resampling process defined in this subsection can also be used for frame-to-frame resampling, where both frames are interlaced materials. In these cases, the field-to-field resampling may be applied to the top field pair and the bottom field pair, respectively.

Field-to-Frame Resampling

For field-to-frame resampling, in some embodiments, it may be assumed that the base picture is a field and the enhancement picture is a coincident frame. An exemplary application for field-to-frame resampling is SD-to-720p scalable coding.

In some embodiments, resampling design may be simplified by breaking down each of the up-sampling and down-sampling processes into two stages.

For up-sampling, a field is up-sampled to a first field of the same parity using methods described above in relation to field-to-field resampling. Then the opposite parity field in the enhancement frame may be generated by interpolating the first field using a symmetric even-tap filter along the vertical direction. In some embodiments, a default filter for H.264 1/2-sample interpolation may be applied.

Some embodiments of the present invention may be described with reference to FIG. 6. In these embodiments, a base layer field is upsampled 60 to an enhancement layer resolution. This upsampled field is then interpolated or otherwise modified to generate 62 an opposite-parity field. These matching fields may be combined to form an interlaced frame. In some embodiments, these fields may be processed to form a progressive frame.

In some embodiments of the present invention, illustrated in FIG. 7, a first resolution sample position corresponding to a second resolution location is determined 70. The first resolution sample position corresponding to the second resolution location may be an intermediate position between the pixels of a first resolution layer, such as a base layer. Based on this position, filter data may be obtained, selected or calculated 72. This filter data may then be used to filter 74 the first-resolution picture to produce a second resolution picture or field. In some embodiments, these pictures will be corresponding fields of an interlaced image. This second-resolution field may then be interpolated or otherwise processed to produce an opposite-parity field 76. The pair of second resolution fields may then make up an interlaced frame or may be further processed to make a progressive frame or some other image.

Some embodiments of the present invention may be described with reference to FIG. 8. In these embodiments, a first-resolution sample position may be determined 80 as was done in the previous exemplary embodiment described in relation to FIG. 7. A first-resolution phase offset may then be calculated 82 to describe the offset between a first-resolution pixel and the first-resolution sample position. This phase offset may then be used to select, obtain or calculate filter data 84. This filter data may then be used to filter the first-resolution picture 86 thereby producing a second-resolution filtered field. This filtered field may then be interpolated or otherwise processed to produce an opposite-parity field 88. These fields may then be further processed as described for other embodiments.
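The second stage of the up-sampling described above, generation of an opposite-parity field by vertical interpolation, may be sketched as follows using the H.264 half-sample luma filter taps (1, -5, 20, 20, -5, 1)/32. The function names, the list-of-rows picture representation, the edge clamping, and the choice that output line r lies midway between input lines r and r+1 (as when a bottom field is derived from a top field) are all assumptions of this example.

```python
HALF_SAMPLE_TAPS = (1, -5, 20, 20, -5, 1)   # H.264 half-sample luma filter, sum 32


def opposite_parity_field(field, bit_depth=8):
    """Interpolate a field vertically at half-line offsets (assumed edge clamping)."""
    height, width = len(field), len(field[0])
    max_val = (1 << bit_depth) - 1
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            acc = 0
            for k, tap in enumerate(HALF_SAMPLE_TAPS):
                rr = min(max(r + k - 2, 0), height - 1)   # clamp at picture edges
                acc += tap * field[rr][c]
            row.append(min(max((acc + 16) >> 5, 0), max_val))
        out.append(row)
    return out


def weave(top, bottom):
    """Interleave two fields into a frame: top lines on even rows."""
    frame = []
    for t, b in zip(top, bottom):
        frame.extend([t, b])
    return frame


ramp = [[10 * r] for r in range(6)]          # one-column field: 0, 10, ..., 50
print(opposite_parity_field(ramp)[2])        # [25], midway between 20 and 30
```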

For down-sampling, a frame may first be down-sampled vertically to preserve the field of the same parity as the base field picture. Then the enhancement field may be down-sampled following processes described above in relation to field-to-field resampling. In some embodiments, a simple solution for the first-step vertical down-sampling is to discard all even or odd lines of the frame.
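The line-discarding first stage may be illustrated by the short sketch below. The function name `extract_field` is hypothetical, and the convention that the top field occupies the even rows (starting at row 0) is an assumption of the example.

```python
def extract_field(frame, parity: str):
    """Keep only the lines of one parity; 'top' is assumed to mean even rows."""
    start = 0 if parity == "top" else 1
    return frame[start::2]


frame = [[r] for r in range(6)]          # six one-sample lines labelled 0..5
print(extract_field(frame, "top"))       # [[0], [2], [4]]
print(extract_field(frame, "bottom"))    # [[1], [3], [5]]
```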

Frame-to-Field Resampling

In some embodiments, for frame-to-field resampling, it may be assumed that the base picture is a frame and the enhancement picture is a field; additionally, the base frame may coincide with the enhancement field. Processes described above in relation to field-to-field resampling can be applied to this scenario with slight modifications.

Given a luma sample in the cropping window of the enhancement field at position (x, y) in units of single luma samples, its corresponding position in the base frame (p_(x,L)(x), p_(y,L)(y)) in units of 1/16 luma samples of the base frame may be derived as:

$$\begin{cases} p_{x,L}(x) = \left[ (x - x_{orig}) \cdot w_{base} \cdot 16 + 8 \cdot (w_{base} - w_{extract}) \right] // w_{extract} \\ p_{y,L}(y) = \left[ (y - y'_{orig}) \cdot h'_{base} \cdot 16 + 8 \cdot (h'_{base} - h_{extract}) \right] // h_{extract} \end{cases} \quad \text{with } y'_{orig} = \frac{y_{orig}}{2} \text{ and } h'_{base} = 2 \cdot h_{base}. \tag{11}$$

Similarly, given a chroma sample in the enhancement picture at position (x_(c), y_(c)) in units of single chroma samples, its corresponding position in the base picture (p_(x,c)(x_(c)), p_(y,c)(y_(c))) in units of 1/16 chroma samples of the base picture may be derived as:

$$\begin{cases} p_{x,c}(x_{c}) = \left[ (x_{c} - x_{orig,c}) \cdot w_{base,c} \cdot 16 + 4 \cdot (2 + p_{enh,x}) w_{base,c} - 4 \cdot (2 + p_{base,x}) w_{extract,c} \right] // w_{extract,c} \\ p_{y,c}(y_{c}) = \left[ (y_{c} - y'_{orig,c}) \cdot h'_{base,c} \cdot 16 + 2 \cdot (4 + \phi_{enh,y}) h'_{base,c} - 2 \cdot (4 + \phi_{base,y}) h_{extract,c} \right] // h_{extract,c} \end{cases} \quad \text{with } y'_{orig,c} = \frac{y_{orig,c}}{2} \text{ and } h'_{base,c} = 2 \cdot h_{base,c}. \tag{12}$$

Similarly, given a luma sample in the low-resolution frame at position (x, y) in units of single luma samples of the low-resolution picture, its corresponding position in the high-resolution field (p_(x,L), p_(y,L)) in units of one-sixteenth luma samples of the high-resolution field may be derived as:

$$\begin{cases} p_{x,L} = 16 \cdot x_{orig} + \left[ x \cdot w_{extract} \cdot 16 + 8 \cdot (w_{extract} - w_{base}) \right] // w_{base} \\ p_{y,L} = 16 \cdot y'_{orig} + \left[ y \cdot h_{extract} \cdot 16 + 8 \cdot (h_{extract} - h'_{base}) \right] // h'_{base} \end{cases} \tag{13}$$

Given a chroma sample in the low-resolution frame at position (x_(c), y_(c)) in units of single chroma samples of the low-resolution picture, its corresponding position in the high-resolution field (p_(x,c), p_(y,c)) in units of one-sixteenth chroma samples of the high-resolution field may be derived as:

$\begin{matrix}\left\{ \begin{matrix}{p_{x,c} = 16 \cdot x_{{orig},c} + \left\lbrack x_{c} \cdot w_{{extract},c} \cdot 16 + 4 \cdot (2 + p_{{base},x}) \cdot w_{{extract},c} - 4 \cdot (2 + p_{{enh},x}) \cdot w_{{base},c} \right\rbrack // w_{{base},c}} \\ {p_{y,c} = 16 \cdot y_{{orig},c}^{\prime} + \left\lbrack y_{c} \cdot h_{{extract},c} \cdot 16 + 2 \cdot (4 + \phi_{{base},y}) \cdot h_{{extract},c} - 2 \cdot (4 + \phi_{{enh},y}) \cdot h_{{base},c}^{\prime} \right\rbrack // h_{{base},c}^{\prime}}\end{matrix} \right. & (14)\end{matrix}$

Eqs. 11-14 can also be backward compatible with Eqs. 1-4, and therefore, the proposed field-to-frame resampling process may be related to the frame-to-frame resampling process.

Frame-to-Frame Resampling—Special Case

A special case for frame-to-frame resampling is that the base progressive frame coincides with a field of the enhancement frame (of two fields with different presentation time codes).

The two-stage processes described above in relation to field-to-frame resampling can be applied together with the process described above in relation to frame-to-field resampling.

Interpolation—Up-Sampling

In some embodiments, interpolating the enhancement-layer image value at sample position (x, y) in the enhancement spatial picture comprises a filtering process. The filtering process may further comprise determining interpolation-filter coefficients from a look-up-table wherein the index into the look-up-table may be related to the interpolation position determined by (p_(x,L)(x), p_(y,L)(y)).

In some embodiments, the interpolation filter may be a 4-tap filter. In some embodiments, the interpolation filter may be a 6-tap filter. In some embodiments, the filter coefficients may be derived from the two-lobed or three-lobed Lanczos-windowed sinc functions.

Table 1 and Table 2 comprise exemplary look-up-tables of interpolation-filter coefficients for a 16-phase 6-tap interpolation filter wherein the phase corresponds to the interpolation position determined by (p_(x,L)(x), p_(y,L)(y)).

TABLE 1 (6-tap) interpolation filter coefficients

phase   e[−2]   e[−1]   e[0]   e[1]   e[2]   e[3]
0       0       0       32     0      0      0
1       0       −2      32     2      0      0
2       1       −3      31     4      −1     0
3       1       −4      30     7      −2     0
4       1       −4      28     9      −2     0
5       1       −5      27     11     −3     1
6       1       −5      25     14     −3     0
7       1       −5      22     17     −4     1
8       1       −5      20     20     −5     1
9       1       −4      17     22     −5     1
10      0       −3      14     25     −5     1
11      1       −3      11     27     −5     1
12      0       −2      9      28     −4     1
13      0       −2      7      30     −4     1
14      0       −1      4      31     −3     1
15      0       0       2      32     −2     0

TABLE 2 (6-tap) interpolation filter coefficients

phase   e[−2]   e[−1]   e[0]   e[1]   e[2]   e[3]
0       0       0       32     0      0      0
1       0       −2      32     2      0      0
2       1       −3      31     4      −1     0
3       1       −4      30     6      −1     0
4       1       −4      28     9      −2     0
5       1       −4      27     11     −3     0
6       1       −5      25     14     −3     0
7       1       −5      22     17     −4     1
8       1       −5      20     20     −5     1
9       1       −4      17     22     −5     1
10      0       −3      14     25     −5     1
11      0       −3      11     27     −4     1
12      0       −2      9      28     −4     1
13      0       −1      6      30     −4     1
14      0       −1      4      31     −3     1
15      0       0       2      32     −2     0

Table 3 comprises a look-up-table of interpolation-filter coefficients for a 16-phase 4-tap interpolation filter wherein the phase corresponds to the interpolation position determined by (p_(x,L)(x), p_(y,L)(y)).

TABLE 3 (4-tap) interpolation filter coefficients

phase   e[−1]   e[0]   e[1]   e[2]
0       0       128    0      0
1       −4      127    5      0
2       −8      124    13     −1
3       −10     118    21     −1
4       −11     111    30     −2
5       −11     103    40     −4
6       −10     93     50     −5
7       −9      82     61     −6
8       −8      72     72     −8
9       −6      61     82     −9
10      −5      50     93     −10
11      −4      40     103    −11
12      −2      30     111    −11
13      −1      21     118    −10
14      −1      13     124    −8
15      0       5      127    −4
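As a sanity check on the tabulated values, the following Python sketch stores Table 3 as a look-up-table indexed by phase and verifies two properties that can be read off the table: every row sums to 128, and the row for phase k is the mirror image of the row for phase 16 − k. The names TABLE_3 and check_table are hypothetical.

```python
# Table 3 rows indexed by phase; taps are e[-1], e[0], e[1], e[2].
TABLE_3 = [
    (0, 128, 0, 0), (-4, 127, 5, 0), (-8, 124, 13, -1), (-10, 118, 21, -1),
    (-11, 111, 30, -2), (-11, 103, 40, -4), (-10, 93, 50, -5), (-9, 82, 61, -6),
    (-8, 72, 72, -8), (-6, 61, 82, -9), (-5, 50, 93, -10), (-4, 40, 103, -11),
    (-2, 30, 111, -11), (-1, 21, 118, -10), (-1, 13, 124, -8), (0, 5, 127, -4),
]


def check_table(table, gain):
    """Return True if each row sums to `gain` and phase k mirrors phase n - k."""
    n = len(table)
    sums_ok = all(sum(row) == gain for row in table)
    mirror_ok = all(table[k] == table[n - k][::-1] for k in range(1, n))
    return sums_ok and mirror_ok


print(check_table(TABLE_3, 128))  # True
```

The constant row sum means that a flat image region is reproduced unchanged once the filter output is divided by 128.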

Some embodiments of the current invention are illustrated in FIG. 3. Interpolation in the x- and y-directions may be done in separate passes, 300 and 30, respectively. In some embodiments, each pass may be performed within a macroblock or another sub-division of the image. In other embodiments, each pass may be performed within the entire image.

For a sample position in the enhancement layer 31, i.e., the location of an enhancement-layer pixel, the corresponding position in the base layer 32 may be determined 301. The offset, or phase, in each direction, y-position phase 33 and x-position phase 34, of the sample in the base layer from an integer base-layer pixel location may be determined, 302 and 303, respectively, from the corresponding base-layer pixel position 32 of an enhancement-layer pixel position 31. The offset or phase may be determined in units of interpolation resolution. For example, for an interpolation resolution of one-sixteenth, a phase of 0 may correspond to no offset from a base-layer pixel position. A phase of 8 may correspond to an enhancement-layer pixel that falls, in one dimension, half-way between base-layer pixel positions.
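For an interpolation resolution of one-sixteenth, the split of a base-layer position into an integer pixel location and a phase may be sketched as below. The function name is hypothetical, and taking the integer location as the floor of the position (so that the phase is always in the range 0 to 15) is an assumption of this sketch.

```python
def split_position(pos16):
    """Split a position in 1/16-sample units into (integer location, phase).

    Assumption: the integer location is the floor of the position, so the
    phase is always in 0..15, also for negative positions.
    """
    return pos16 >> 4, pos16 & 15


print(split_position(40))  # (2, 8): half-way between pixels 2 and 3
print(split_position(32))  # (2, 0): exactly on pixel 2
print(split_position(-4))  # (-1, 12)
```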

The interpolation filter coefficients may be determined by look-up-table in which the y-position phase 33 may be the index when interpolating in the y-direction, or the x-position phase 34 may be the index when interpolating in the x-direction. The position interpolation center, for a given direction, is the pixel location in the base layer with respect to which the position phase may be measured. In some embodiments of the current invention, the position interpolation center is the pixel location on which the filter is centered.
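The look-up and the two separate passes may be sketched as follows. This is a simplified illustration, not the normative process: only the phase-0 and phase-8 rows of Table 3 are included, the edge clamping and the rounding offset are assumptions, each pass rounds its own result, and all names are hypothetical.

```python
# Phase-0 and phase-8 rows of Table 3; taps are e[-1], e[0], e[1], e[2].
FILTER_4TAP = {0: (0, 128, 0, 0), 8: (-8, 72, 72, -8)}


def interp_1d(samples, pos16):
    """Interpolate a 1-D list at position pos16 (units of 1/16 sample)."""
    center, phase = pos16 >> 4, pos16 & 15   # interpolation center and phase
    taps = FILTER_4TAP[phase]                # coefficients looked up by phase
    acc = 0
    for k, coeff in zip(range(-1, 3), taps):
        idx = min(max(center + k, 0), len(samples) - 1)  # assumed edge clamping
        acc += coeff * samples[idx]
    return (acc + 64) >> 7                   # assumed rounding; taps sum to 128


def interp_2d(image, px16, py16):
    """Separable interpolation: x-direction pass, then y-direction pass."""
    column = [interp_1d(row, px16) for row in image]  # x-pass on every row
    return interp_1d(column, py16)                    # y-pass on the result


ramp = [10, 20, 30, 40]
print(interp_1d(ramp, 24))            # 25: half-way between 20 and 30
print(interp_2d([ramp] * 4, 24, 24))  # 25
```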

Interpolation—Down-Sampling

Down-sampling filters can be selected from a pre-defined set of filters, in some embodiments. Down-sampling filters may also be calculated on-the-fly or modified in relation to position or other image parameters. However, the down-sampling process is not a normative part of the SVC.

Syntax and Signaling

Some embodiments of the present invention may be described with reference to FIG. 9. In these embodiments, a resampling mode is determined 90 for a spatially-scalable picture. This is typically done during the encoding process. Since multiple modes are available, this resampling mode must be signaled 92 with the encoded image. In some embodiments, the resampling mode may be encoded with the image bitstream as a flag. Some exemplary embodiments are described in the following sections.

New Syntax Element in Sequence Parameter Set of SVC

seq_parameter_set_rbsp( ) {                           C    Descriptor
    ...
    if( profile_idc = = 83 ) {
        ...
        base_frame_progressive_only_flag              0    u(1)
        if( base_frame_progressive_only_flag )
            base_frame_from_field                     0    u(2)
        ...
    }
    ...
}

In some embodiments of the present invention, which are related to the JVT standard, to support all possible cases described above, a new sequence-level signal should be added to the Sequence Parameter Set (SPS) of the SVC extension. The flag (for example, “base_frame_progressive_only_flag”) specifies whether the base layer pictures are all progressive frames or all interlaced materials.

For interlaced base pictures, processes described above in relation to field-to-field resampling or field-to-frame resampling may be invoked. (Note: this will cover the case of adaptive frame/field coding.)

When the base layer pictures are all progressive frames, another signal (for example, “base_frame_from_field”) may be sent to specify whether a base frame picture is derived from a top field, a bottom field, or a frame in the enhancement frame. Processes described above in relation to frame-to-field resampling or the frame-to-frame special case may then be invoked for interlaced enhancement pictures. Of course, by default, the existing resampling procedure may be applied to handle resampling between progressive frames.
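Reading the two sequence-level elements in the order given by the syntax table may be sketched as follows. The BitReader helper and the function name are hypothetical, the reader assumes MSB-first bit order for the u(n) descriptor, and the meaning assigned to each 2-bit value of base_frame_from_field is not specified here, so the raw value is returned.

```python
class BitReader:
    """Minimal MSB-first bit reader (hypothetical helper for this sketch)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current bit position

    def u(self, n):
        """Read n bits as an unsigned integer, as for a u(n) descriptor."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_new_sps_elements(reader):
    """Return (base_frame_progressive_only_flag, base_frame_from_field)."""
    progressive_only = reader.u(1)
    # base_frame_from_field is only present when the flag is set.
    from_field = reader.u(2) if progressive_only else None
    return progressive_only, from_field


print(parse_new_sps_elements(BitReader(bytes([0b11000000]))))  # (1, 2)
print(parse_new_sps_elements(BitReader(bytes([0b00000000]))))  # (0, None)
```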

In some embodiments, the signals may be sent explicitly for each SPS. However, in alternative embodiments, they may be sent only when the “frame_mbs_only_flag” of the enhancement picture is “0”.

In some embodiments, the same signals can also be used for other inter-layer prediction decisions (especially inter-layer mode/motion prediction) in addition to the texture prediction.

New Syntax Element in Slice Header of SVC

                                                                C    Descriptor
if( base_frame_mbs_only_flag && !frame_mbs_only_flag &&
        !field_pic_flag )
    base_frame_from_bottom_field_flag                           2    u(1)
else if( frame_mbs_only_flag && !base_frame_mbs_only_flag &&
        !base_field_pic_flag )
    base_bottom_field_coincided_flag                            2    u(1)

In some embodiments, a more flexible option for the signaling is to have a couple of single-bit flags inserted into the SVC slice header, e.g., base_frame_from_bottom_field_flag and base_bottom_field_coincided_flag.

In some exemplary embodiments, which are related to the JVT standard, one or more of the following flags may be used:

-   base_frame_from_bottom_field_flag equal to 0 indicates the base frame picture (i.e., the frame picture with dependency_id equal to DependencyIdBase) is derived from a top field picture in the current layer. base_frame_from_bottom_field_flag equal to 1 indicates the base frame picture (i.e., the frame picture with dependency_id equal to DependencyIdBase) is derived from a bottom field picture in the current layer. When base_frame_from_bottom_field_flag is not present, it shall be inferred to be equal to 0.
-   base_bottom_field_coincided_flag equal to 0 indicates the top field of the base frame picture (i.e., the frame picture with dependency_id equal to DependencyIdBase) coincides with the current frame picture in the current layer. base_bottom_field_coincided_flag equal to 1 indicates the bottom field of the base frame picture (i.e., the frame picture with dependency_id equal to DependencyIdBase) coincides with the current frame picture in the current layer. When base_bottom_field_coincided_flag is not present, it shall be inferred to be equal to 0.
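The presence conditions of the slice-header syntax and the inference rule for absent flags may be sketched together as follows. The function name and the read_bit callable, which stands in for reading one u(1) element from the bitstream, are hypothetical.

```python
def parse_slice_header_flags(read_bit, base_frame_mbs_only_flag,
                             frame_mbs_only_flag, field_pic_flag,
                             base_field_pic_flag):
    """Return the two slice-header flags; absent flags are inferred to be 0."""
    flags = {
        "base_frame_from_bottom_field_flag": 0,
        "base_bottom_field_coincided_flag": 0,
    }
    if base_frame_mbs_only_flag and not frame_mbs_only_flag and not field_pic_flag:
        flags["base_frame_from_bottom_field_flag"] = read_bit()
    elif frame_mbs_only_flag and not base_frame_mbs_only_flag and not base_field_pic_flag:
        flags["base_bottom_field_coincided_flag"] = read_bit()
    return flags


# Progressive base, interlaced enhancement coded as a frame picture:
bits = iter([1])
print(parse_slice_header_flags(lambda: next(bits), 1, 0, 0, 0))
```

In the example call only base_frame_from_bottom_field_flag is read from the bitstream; the other flag keeps its inferred value of 0.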

The related existing H.264 symbols are defined as follows.

-   frame_mbs_only_flag specifies whether the sequence of the current layer is coded as progressive frame pictures or interlaced frame/field pictures.
-   base_frame_mbs_only_flag is equal to frame_mbs_only_flag of the base picture.
-   field_pic_flag specifies whether the current picture is a field picture or frame picture.
-   base_field_pic_flag is equal to field_pic_flag of the base picture.

Alternative Resampling Filters

Interlaced-materials-related embodiments described above comprise solutions for resampling phase position issues; however, the specific filter design is not fixed. In some embodiments, the default direct interpolation filters described above and in the incorporated JVT documents in relation to non-interlaced materials may be applied to up-sampling of interlaced materials as well. However, there is no restriction on alternative filter designs, including sequence- or picture-level adaptive filter selections, for other embodiments.

Other Embodiments

As specified in the SVC, the residual interpolation positions may be derived using the same equations for texture interpolation; however, the simple bilinear interpolation filters should be applied for the residual up-sampling in SVC-related embodiments to be consistent with current SVC design.

Many equations in this document have been explicitly written for direct interpolation with 1/16-sample accuracy. In other embodiments, the resampling accuracy may be modified to other than 1/16 sample.

The terms and expressions which have been employed in the foregoing specification are used therein as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding equivalence of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims which follow.

1. A method for picture resampling from a first resolution picture to a second resolution frame said method comprising: a) deriving a first resolution sample position corresponding to a sample position in a second resolution field; b) obtaining filter data wherein said filter data is related to said first resolution sample position; and c) filtering said first resolution picture with a filter based on said filter data thereby producing a filtered field.
2. A method as described in claim 1 further comprising repeating steps a-c for an opposite-parity field.
3. A method as described in claim 1 further comprising interpolating said filtered field, thereby producing an opposite parity field.
4. A method as described in claim 1 wherein said first resolution picture is an interlaced field.
5. A method as described in claim 1 wherein said filter data comprises a filter coefficient.
6. A method as described in claim 1 wherein said deriving a first resolution sample position comprises determining a phase for said first resolution position.
7. A method as described in claim 6 wherein said filter data is related to the phase of said first resolution sample position.
8. A method as described in claim 1 wherein obtaining filter data comprises obtaining first filter data for a first direction based on said first resolution sample position and obtaining second filter data for a second direction based on said first resolution sample position.
9. A method as described in claim 6 wherein obtaining filter data comprises obtaining first filter data for a first direction based on a phase of said first resolution sample position and obtaining second filter data for a second direction based on a phase of said first resolution sample position.
10. A method for signaling a spatially-scalable resampling mode, said method comprising: a) determining a resampling mode for generating a second-resolution picture from a first-resolution input picture, wherein said second-resolution picture is to be encoded as a representation of said input picture from which a first resolution layer and a second resolution layer can be decoded; and b) signaling said resampling mode with at least one flag that is encoded with said representation.
11. A method as described in claim 10 wherein said resampling mode comprises deriving said second-resolution picture from a top field of said first-resolution input picture.
12. A method as described in claim 10 wherein said resampling mode comprises deriving said second-resolution picture from a bottom field of said first-resolution input picture.
13. A method as described in claim 10 wherein said resampling mode comprises deriving said second-resolution picture from a frame of said first-resolution input picture.
14. A method as described in claim 10 wherein said resampling mode comprises deriving said second-resolution picture from a top field of said first-resolution input picture.
15. A system for picture resampling from a first resolution picture to a second resolution frame said system comprising: a) a position calculator for deriving a first resolution sample position corresponding to a sample position in a second resolution field; b) a data manager for managing filter data wherein said filter data is related to said first resolution sample position; and c) a filter for filtering said first resolution picture based on said filter data thereby producing a filtered field.
16. A system as described in claim 15 further comprising an interpolator for interpolating said filtered field, thereby producing an opposite parity field.
17. A system as described in claim 15 further comprising: a) an opposite-parity position calculator for deriving an opposite-parity first resolution sample position corresponding to an opposite-parity sample position in an opposite-parity second resolution field; b) wherein said data manager is also capable of managing opposite-parity filter data wherein said opposite-parity filter data is related to said opposite-parity first resolution sample position; and c) an opposite-parity filter for filtering a first resolution opposite-parity field based on said filter data thereby producing an opposite-parity filtered field.
18. A system as described in claim 15 wherein said first resolution picture is an interlaced field.
19. A system as described in claim 15 wherein said filter data comprises a filter coefficient.
20. A system as described in claim 15 further comprising a phase calculator for determining a phase of said first resolution sample position.